Abstract

Seymour’s distance two conjecture states that in any digraph there
exists a vertex (a “Seymour vertex”) that has at least as many neighbors
at distance two as it does at distance one. We explore the validity of
probabilistic statements along lines suggested by Seymour’s conjecture,
proving that almost surely there are a “large” number of Seymour vertices
in random tournaments and “even more” in general random digraphs.
1 Introduction


1.1 Notation 

For the purpose of this paper, a digraph exclusively means a simple, directed 
graph without loops or multiple edges (including edges in the same direction 
and antiparallel edges). 

For any pair of vertices $u, v$ in a digraph $D$, the length of the shortest
directed path from $u$ to $v$ in $D$ is denoted by $\mathrm{dist}(u,v)$. We
write $N_i(u)$ to denote the set of vertices that are at distance $i$ from $u$.
A vertex $v_0 \in V(D)$ is called a Seymour vertex if
$|N_2(v_0)| \ge |N_1(v_0)|$. We write $S$ for the set of all Seymour vertices
in the digraph.
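
The definition above can be checked mechanically on small examples. The following is a minimal sketch, assuming a digraph is stored as a dictionary mapping each vertex to a list of its out-neighbors; the function names `neighborhoods` and `seymour_vertices` are illustrative and do not come from the paper.

```python
from collections import deque

def neighborhoods(adj, u):
    """Return (N1, N2): the vertices at directed distance exactly 1 and 2 from u."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == 2:          # nothing beyond distance two is needed
            continue
        for y in adj[x]:
            if y not in dist:     # first visit in BFS order gives the shortest distance
                dist[y] = dist[x] + 1
                queue.append(y)
    n1 = {v for v, d in dist.items() if d == 1}
    n2 = {v for v, d in dist.items() if d == 2}
    return n1, n2

def seymour_vertices(adj):
    """Return the set of vertices u with |N2(u)| >= |N1(u)|."""
    result = set()
    for u in adj:
        n1, n2 = neighborhoods(adj, u)
        if len(n2) >= len(n1):
            result.add(u)
    return result

# Directed triangle 0 -> 1 -> 2 -> 0: every vertex has |N1| = |N2| = 1.
cycle = {0: [1], 1: [2], 2: [0]}
# Transitive tournament on three vertices: only the sink qualifies (0 >= 0).
transitive = {0: [1, 2], 1: [2], 2: []}

print(seymour_vertices(cycle))
print(seymour_vertices(transitive))
```

In the transitive example the only Seymour vertex is the sink, which shows that the conjecture can hold with a vertex of outdegree zero.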

1.2 Background: Seymour’s Conjecture 

Conjecture 1.1. (Seymour’s Second Neighborhood Conjecture). If $D$ is a
directed graph with no loops or multiple edges, then $D$ has a vertex $v_0$ such
that $|N_2(v_0)| \ge |N_1(v_0)|$.

Although this conjecture remains open, several partial results
have been proved over the last two decades:

Theorem 1.2. (Kaneko and Locke) Seymour’s conjecture is true if the
minimum outdegree of vertices in $D$ is at most 6.

Dean’s Conjecture. Seymour’s conjecture is true if $D$ is any tournament
$T$.

Theorem 1.3. (Fisher) Dean’s conjecture is true.

Chen, Shen, and Yuster [3] have shown that for every digraph, there is
a vertex $v$ such that $|N_2(v)| \ge r|N_1(v)|$, where $r \approx 0.657$, and
they state a further improvement to $r \approx 0.678$. See the website [5] for
details. Seymour’s conjecture may be seen as a special case of a more general
1978 conjecture of Caccetta and Häggkvist:

Caccetta-Häggkvist Conjecture. If $D$ is a simple digraph on $n$ vertices,
and each vertex has outdegree at least $d$, then the girth of $D$ (the length
of the shortest directed cycle) is at most $\lceil n/d \rceil$.
The Caccetta-Häggkvist conjecture has been proved for $d = 2, 3, 4, 5$. See
Douglas West’s website for several related results pertinent to the conjecture.
The truth of Seymour’s conjecture would settle the important “balanced”
$d = n/3$ case of the Caccetta-Häggkvist conjecture, i.e., when each vertex has
in- and outdegree at least $d = n/3$. A short proof of this fact follows in the
case $3 \mid n$:

We need to prove that $D$ has a directed triangle. We let $v$ be a Seymour
vertex, and note that the other vertices separate themselves out into
$N_1(v)$, with $|N_1(v)| \ge n/3$; $N_{-1}(v)$, with $|N_{-1}(v)| \ge n/3$
(where $N_{-1}(v)$ consists of those vertices that point towards $v$); and
$N_0(v)$, with $|N_0(v)| < n/3$, which are those vertices that have no edge to
or from $v$. Now if there is an edge from $u \in N_1(v)$ to $w \in N_{-1}(v)$,
then $v$, $u$, and $w$ create a directed triangle, and we are done. On the
other hand, if there is no such edge, then vertices in $N_{-1}(v)$ cannot be at
distance two, forcing all distance two vertices to be in $N_0(v)$, which leads
to the contradiction that

$$|N_2(v)| \le |N_0(v)| < n/3 \le |N_1(v)|.$$

□

In this paper,^1 we study the number $S = S_n = S_{n,p}$ of Seymour vertices in
random tournaments and random digraphs. Actually, our proofs will reveal
that Nate Dean’s conjecture, proved by Fisher in [5], is very insightful; in
particular, we will see that there are many more Seymour vertices in random
digraphs with $p < 1/2$ (definitions below) than there are in random
tournaments, and the tightness of the concentration is greater in the former
case.

Specifically, it is shown in Section 2 that there are close to $n/2$ Seymour
vertices in random tournaments with high probability, where “close to” and
“with high probability” are interpreted in a variety of ways. In particular,
both convergence in measure and almost everywhere convergence are invoked.
An interesting variance computation in this section shows that there is an
oscillation in the number of Seymour vertices as we add additional vertices
to the tournament, and this is reflected in the piecewise linear “even-odd”
dichotomy in the variance of the number of Seymour vertices. After methods
such as the exponential inequalities of Azuma and Talagrand failed, we used
skeletal subsequences of polynomial size (along with an analysis of maximal
deviation between these checkpoints) to establish inequalities that yield the
almost everywhere convergence referred to above.

^1 This work was started by the first three authors, reported on at [9], and
completed this year by Godbole and Zhang.

In Section 3, we consider random digraphs on $n$ vertices, and show that
the probability that every vertex is a Seymour vertex tends to 1 as
$n \to \infty$, provided that the edge probability $p$ satisfies
$o(1) \le p \le \frac{1}{2} - o^*(1)$ for well-specified $o(1)$ and $o^*(1)$
functions.


2 Random Tournaments 


An orientation of a graph $G$ is a digraph $D$ obtained from $G$ by choosing an
orientation ($u \to v$ or $v \to u$) for each edge $uv \in E(G)$. A tournament
is an orientation of a complete graph $K_n$. Our model for a random tournament
is the probability space of all possible orientations of the complete graph
chosen in an equiprobable fashion. Equivalently, the orientation of each edge
$uv \in E(K_n)$ is chosen independently as $u \to v$ or $v \to u$ with
probability $1/2$.
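
The random tournament model is easy to simulate. The sketch below is an illustration only: it orients each pair independently with probability 1/2 and counts the Seymour vertices of the result. The helper names `random_tournament` and `count_seymour`, the seed, and the sizes are assumptions made for the example, not part of the paper.

```python
import random

def random_tournament(n, rng):
    """Out-neighbor sets of a uniformly random tournament on {0, ..., n-1}."""
    out = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            # Each pair is oriented u -> v or v -> u with probability 1/2.
            if rng.random() < 0.5:
                out[u].add(v)
            else:
                out[v].add(u)
    return out

def count_seymour(out):
    """Number of vertices u with |N2(u)| >= |N1(u)|."""
    count = 0
    for u in range(len(out)):
        n1 = out[u]
        n2 = set()
        for x in n1:
            n2 |= out[x]          # everything reachable in two steps
        n2 -= n1                  # keep only vertices at distance exactly two
        n2.discard(u)
        if len(n2) >= len(n1):
            count += 1
    return count

rng = random.Random(0)            # fixed seed so the run is reproducible
n = 100
counts = [count_seymour(random_tournament(n, rng)) for _ in range(10)]
print(sum(counts) / len(counts))  # empirical mean number of Seymour vertices
```

By Fisher’s theorem every tournament has at least one Seymour vertex, so `count_seymour` never returns zero on the output of `random_tournament`.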

Proposition 2.1. Let $T_n$ be a random tournament and $S$ the set of its
Seymour vertices. Then

$$E(|S|) \sim \frac{n}{2}(1 + o(1))$$

as $n \to \infty$.


Proof. Let $X := |S|$, and for $i \in [n]$, define

$$X_i = \begin{cases} 1 & \text{if vertex } i \text{ is a Seymour vertex} \\ 0 & \text{otherwise,} \end{cases}$$

so that $X = \sum_{i=1}^{n} X_i$.


By linearity of expectation,

$$\begin{aligned}
E(X) &= nP(1 \in S) \\
&= nP(1 \in S;\ |N_1(1)| + |N_2(1)| = n-1) + nP(1 \in S;\ |N_1(1)| + |N_2(1)| < n-1) \\
&\le nP(1 \in S;\ |N_1(1)| + |N_2(1)| = n-1) + nP(|N_1(1)| + |N_2(1)| < n-1) \qquad (1)
\end{aligned}$$

and

$$\begin{aligned}
E(X) &\ge nP(1 \in S;\ |N_1(1)| + |N_2(1)| = n-1) \\
&= nP(|N_1(1)| \le (n-1)/2;\ |N_1(1)| + |N_2(1)| = n-1). \qquad (2)
\end{aligned}$$
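By (1),

$$\begin{aligned}
E(X) &\le nP(1 \in S;\ |N_1(1)| + |N_2(1)| = n-1) + n(n-1)P(\mathrm{dist}(1,2) \ge 3) \\
&\le nP\left(|N_1(1)| \le \frac{n-1}{2}\right) + n(n-1) \cdot \frac{1}{2}\left(\frac{3}{4}\right)^{n-2},
\end{aligned}$$

since to have $\mathrm{dist}(1,2) \ge 3$, the edge $1 \to 2$ must be absent and,
furthermore, for any vertex $i \in \{3, 4, \ldots, n\}$, $1 \to i$ and $i \to 2$
cannot both be present.

The second term is exponentially small and, in the first term,
$P(|N_1(1)| \le \frac{n-1}{2})$ is clearly $\frac{1}{2}$ if $n$ is even; if $n$
is odd, then
$P(|N_1(1)| \le \frac{n-1}{2}) = \frac{1}{2} + \frac{1}{2}P(|N_1(1)| = \frac{n-1}{2})$.
Stirling’s approximation gives
$P(|N_1(1)| = \frac{n-1}{2}) \sim \sqrt{\frac{2}{\pi n}}$, so that

$$E(X) \le \begin{cases} \frac{n}{2} + o_1(1) & \text{if } n \text{ is even} \\ \frac{n}{2}(1 + o_2(1)) & \text{if } n \text{ is odd,} \end{cases}$$

where $o_1(1)$ is exponentially small and $o_2(1) = O(1/\sqrt{n})$. Since

$$P(|N_1(1)| + |N_2(1)| = n-1) = 1 - P(|N_1(1)| + |N_2(1)| < n-1) \ge 1 - (n-1) \cdot \frac{1}{2}\left(\frac{3}{4}\right)^{n-2},$$

(2) gives

$$E(X) \ge nP\left(|N_1(1)| \le \frac{n-1}{2}\right) - n(n-1) \cdot \frac{1}{2}\left(\frac{3}{4}\right)^{n-2}.$$

A similar analysis as above now establishes the result.

□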
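
The even-odd difference in the main term of the proof can be verified numerically. The sketch below assumes only that the outdegree of a fixed vertex is Binomial(n-1, 1/2); the function name `prob_outdeg_at_most_half` is illustrative. It computes $P(|N_1(1)| \le (n-1)/2)$ exactly and compares the central binomial probability with the Stirling estimate $\sqrt{2/(\pi n)}$.

```python
from math import comb, pi, sqrt

def prob_outdeg_at_most_half(n):
    """Exact P(Bin(n-1, 1/2) <= (n-1)/2), returned as a float."""
    m = n - 1
    # k <= (n-1)/2 is the same as k <= m // 2 for integer k.
    return sum(comb(m, k) for k in range(m // 2 + 1)) / 2 ** m

for n in (10, 11, 1000, 1001):
    print(n, prob_outdeg_at_most_half(n))

# For odd n the excess over 1/2 is half the central binomial probability,
# which Stirling's approximation estimates by sqrt(2 / (pi * n)).
n = 1001
central = comb(n - 1, (n - 1) // 2) / 2 ** (n - 1)
print(central, sqrt(2 / (pi * n)))
```

For even n the value is exactly 1/2, while for odd n it exceeds 1/2 by a term of order $1/\sqrt{n}$, which is the source of the two different $o(1)$ functions.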


The difference in the $o(1)$ functions in the above result proves to be highly
significant: one of its immediate ramifications, seen in the next proposition,
is that the variance of the number of Seymour vertices grows linearly, but in
a piecewise fashion. Other less obvious complications might indeed be caused
by this “difference in the even and odd cases.”


Proposition 2.2. Let $T_n$ be a random tournament and $S$ be the set of its
Seymour vertices. Then for constants $C_1$ and $C_2$,
$\mathrm{Var}(|S|) \sim C_1 n(1 + o(1))$ as $n \to \infty$ if $n$ is even, and
$\mathrm{Var}(|S|) \sim C_2 n(1 + o(1))$ as $n \to \infty$ if $n$ is odd.

Proof. Since

$$\mathrm{Var}(X) = \sum_{i=1}^{n}[E(X_i) - E^2(X_i)] + 2\sum_{i<j}[E(X_iX_j) - E(X_i)E(X_j)],$$

and $E(X_iX_j) = E(X_1X_2) = P(1,2 \in S) = 2P(1,2 \in S;\ 1 \to 2)$, the key
term in the above display for the variance is given by
$P(1,2 \in S;\ 1 \to 2) = p_1 + p_2 + p_3 + p_4$, where

$$p_1 = P(1,2 \in S;\ 1 \to 2;\ A_1 \cap A_2),$$

$$p_2 = P(1,2 \in S;\ 1 \to 2;\ A_1^c \cap A_2^c),$$

$$p_3 = P(1,2 \in S;\ 1 \to 2;\ A_1 \cap A_2^c),$$

and

$$p_4 = P(1,2 \in S;\ 1 \to 2;\ A_1^c \cap A_2),$$

and $A_i$, $i = 1, 2$, are the events that all vertices are at distance no more
than 2 from vertex $i$. Since
$p_2, p_3, p_4 \le P(A_1^c) \le \frac{n-1}{2}(3/4)^{n-2}$, $p_1$ is the dominant
term. Up to an exponentially small error, $p_1$ equals

$$\begin{aligned}
&P\left(1 \to 2;\ |N_1(1) \setminus \{2\}| \le \frac{n-1}{2} - 1;\ |N_2(2) \setminus \{1\}| \ge \frac{n-1}{2} - 1\right) \\
&\quad = \frac{1}{2}\, P\left(|N_1(1) \setminus \{2\}| \le \frac{n-1}{2} - 1\right) \times P\left(|N_1^*(2)| \le \frac{n-1}{2}\right),
\end{aligned}$$

where $N_1^*(2)$ is the first neighborhood of vertex 2 in the set
$\{3, 4, \ldots, n\}$.

Notice that if $n$ is even, the first factor above equals
$\frac{1}{2} - \frac{1}{2}P(\mathrm{Bin}(n-2, 0.5) = \frac{n-2}{2})$ and the
second equals
$\frac{1}{2} + \frac{1}{2}P(\mathrm{Bin}(n-2, 0.5) = \frac{n-2}{2})$. If $n$ is
odd, however, the first factor is exactly $1/2$, while the second equals
$\frac{1}{2} + P(\mathrm{Bin}(n-2, 0.5) = \frac{n-1}{2})$.

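Let $n$ be even. Then, considering the proof of Proposition 2.1 and denoting
by $o_1(1)$ a generic function that decays exponentially, we have that

$$\begin{aligned}
\mathrm{Var}(X) &= \sum_{i=1}^{n}[E(X_i) - E^2(X_i)] + 2\sum_{i<j}[E(X_iX_j) - E(X_i)E(X_j)] \\
&\le n\left(\frac{1}{4} + o_1(1)\right) + n^2\left[\frac{1}{4} - \frac{1}{4}P^2\left(\mathrm{Bin}(n-2, 0.5) = \frac{n-2}{2}\right) + 2p_2 + 2p_3 + 2p_4 - \left(\frac{1}{2} + o_1(1)\right)^2\right] \\
&= \frac{n}{4} - \frac{n^2}{4} \cdot \frac{2}{\pi n}(1 + o(1)) + o_1(1) \\
&= n\left(\frac{1}{4} - \frac{1}{2\pi}\right)(1 + o(1)), \qquad (3)
\end{aligned}$$

since
$P(\mathrm{Bin}(n-2, 0.5) = \frac{n-2}{2}) \sim \sqrt{\frac{2}{\pi(n-2)}} \sim \sqrt{\frac{2}{\pi n}}$
by Stirling’s approximation.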
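If $n$ is odd, we have, on the other hand,

$$\begin{aligned}
\mathrm{Var}(X) &= \sum_{i=1}^{n}[E(X_i) - E^2(X_i)] + 2\sum_{i<j}[E(X_iX_j) - E(X_i)E(X_j)] \\
&\le n\left(\frac{1}{4} - \frac{1}{4}P^2\left(\mathrm{Bin}(n-1, 0.5) = \frac{n-1}{2}\right)\right) \\
&\quad + n^2\left[\frac{1}{4} + \frac{1}{2}P\left(\mathrm{Bin}(n-2, 0.5) = \frac{n-1}{2}\right) + 2p_2 + 2p_3 + 2p_4 - \left(\frac{1}{2} + \frac{1}{2}P\left(\mathrm{Bin}(n-1, 0.5) = \frac{n-1}{2}\right)\right)^2\right] \\
&= \frac{n}{4}(1 + o(1)) + \frac{n^2}{2}(\pi_2 - \pi_1) - \frac{n^2}{4}\pi_1^2 + o_1(1), \qquad (4)
\end{aligned}$$

where

$$\pi_1 = P\left(\mathrm{Bin}(n-1, 0.5) = \frac{n-1}{2}\right)$$

and

$$\pi_2 = P\left(\mathrm{Bin}(n-2, 0.5) = \frac{n-1}{2}\right).$$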
Since

$$\pi_2 - \pi_1 \sim \sqrt{\frac{2}{\pi}}\left(\frac{1}{\sqrt{n-2}} - \frac{1}{\sqrt{n-1}}\right)(1 + o(1)),$$


and 



it follows from (4) that 



) (1 + 0 ( 1 )) 


(5) 


in the odd case. It is now straightforward to get matching lower bounds of 
the same order of magnitude as in (3) and (5). This proves the result. □ 

A natural question to ask is why there isn’t a uniform growth rate for the
variance. Here is a heuristic reason: Even though the expected values in both
the even and odd cases are $\sim n/2$, the second order terms are significant.
Suppose we have observed the tournament with an even number of vertices.
By Stirling’s formula, about $C\sqrt{n}$ of the vertices $v$ are “borderline
Seymour,” meaning that $i(v) - o(v) = 1$, and about $C\sqrt{n}$ are borderline
non-Seymour, i.e., satisfy $o(v) - i(v) = 1$, where $i(\cdot)$ and $o(\cdot)$
are the in- and out-degree functions. When a new vertex “joins” the
tournament, notice that we cannot lose Seymour vertices, but borderline
non-Seymour vertices have a 0.5 chance of becoming borderline Seymour, with
$i(v) = o(v)$. There is thus an increase in $E(|S|)$ by $\sim (C/2)\sqrt{n}$,
an increase that almost gets nullified when a second new vertex joins the
tournament ($n$ becomes even again) and borderline Seymour vertices become
borderline non-Seymour. This dynamic evolution of the number of Seymour
vertices causes an ebb and flow in the variance also, as reflected by
Proposition 2.2.

Proposition 2.3. As n goes to infinity, 


(I 


P(||^|-E(|5|)| -)■ 0. 


Proof. Immediate from Chebychev’s inequality and Propositions 2.1 and 2.2,
which indicate that for any $\lambda > 0$,

$$P\big(\big||S| - E(|S|)\big| \ge \lambda\big) \le \frac{Kn}{\lambda^2} \qquad (6)$$

for some constant $K$.


□ 
Discussion. The above rate of convergence is unsatisfactory; for reasons to
be made clearer, we would like to have a summable upper bound on the
probability in (6). This is equivalent to finding an exponential inequality to
bound the probability. Accordingly, we first attempted to use Azuma’s
inequality as found in [1]. Here it turns out that a change in the orientation
of a single edge can, in the worst case scenario, change the value of $S$ quite
dramatically. However, it can be shown that if the tournament is of diameter 2,
and if a change in any edge orientation does not change the diameter, then $S$
cannot change by more than 2 and we have a 2-Lipschitz situation. The
probability of this, moreover, can be shown to be $1 - \delta_n$, where
$\delta_n$ is exponentially small. A modified version of Azuma’s inequality, in
which such small exceptional probabilities are allowed, may be found in [1],
Theorem 2.37, but this too proves to give us a width of concentration of
$\Omega(n)$ around $E(S) \sim n/2$, since there are a quadratic number of edges
in the “edge exposure martingale.” Using the vertex exposure martingale vastly
changes the maximal change in $S$, and thus provides no improvement. Likewise,
Talagrand’s inequality [1] involves a very large linear certification function,
and is similarly unable to squeeze out a better upper bound in (3). We thus
resort to “Chebychev’s inequality on blocks” to prove the next result. (Azuma’s
inequality on blocks, as methodically exploited by Frieze, could conceivably be
used also.)

Theorem 2.4. For each $\varepsilon > 0$,

$$P\left(|S_n - E(S_n)| > n^{0.5+\varepsilon} \text{ infinitely often}\right) = 0.$$

Proof. Let us prove equivalently that for each $\varepsilon > 0$,

$$P\left(|S_n - E(S_n)| > n^{0.5+\varepsilon}\log n \text{ infinitely often}\right) = 0,$$

illustrating the method for $\varepsilon = 1/4$. We have, by Chebychev’s
inequality,

$$P\left(|S_{n^2} - E(S_{n^2})| > n^{3/2}\log n\right) \le \frac{K}{n\log^2 n}$$

for some $K > 0$. Since the right side is summable, we use the Borel-Cantelli
lemma to argue as follows. First, we identify the class of tournaments on
$\mathbb{Z}^+$ with the unit interval $[0,1]$ endowed with Lebesgue measure
$\lambda$. Then the Borel-Cantelli lemma implies that the Lebesgue measure of
those tournaments for which $|S_{n^2} - E(S_{n^2})| > n^{3/2}\log n$ occurs
infinitely often is zero, or, equivalently, that for each tournament $T$ on
$\mathbb{Z}^+$ outside of an exceptional set of measure zero, there exists
$N = N(T)$ such that

$$n \ge N(T) \implies S_{n^2} \in \left[E(S_{n^2}) - n^{3/2}\log n,\ E(S_{n^2}) + n^{3/2}\log n\right].$$


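The goal is to show that the maximal term between the “checkpoints”
determined by the subsequence $a_n = n^2$ cannot be too badly behaved. For any
$N \in \mathbb{Z}^+$, denote by $T_N$ the tournament induced on
$\{1, \ldots, N\}$ by $T$. For $1 \le i \le n^2$, let $I_i$ and $O_i$ be
respectively the in- and out-degrees of vertices in $T_{n^2}$, and let $I'_i$
and $O'_i$ be the in- and out-degrees of vertices in $\{1, 2, \ldots, n^2\}$ to
the “new” vertices $n^2+1, \ldots, j$, where $n^2+1 \le j \le (n+1)^2 - 1$. By
Azuma’s inequality,

$$P\left(\bigcup_{i=1}^{n^2}\{|I_i - O_i| \ge \lambda\}\right) \le n^2 P(|I_i - O_i| \ge \lambda) \le 2n^2\exp\{-\lambda^2/8n^2\},$$

so that

$$P(A_n^c) := P\left(\bigcup_{i=1}^{n^2}\left\{|I_i - O_i| \ge n\sqrt{40\log n}\right\}\right) \le \frac{2}{n^3}.$$

A similar analysis yields

$$P(B_n^c) := P\left(\bigcup_{i=1}^{n^2}\left\{|I'_i - O'_i| \ge \sqrt{80n\log n}\right\}\right) \le \frac{2}{n^3}.$$

Finally, $P(C_n^c) := P(\mathrm{diam}(T_{n^2}) \ge 3)$ is exponentially small.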

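Thus, for $j = n^2+1, \ldots, (n+1)^2 - 1$,

$$\begin{aligned}
&P\left(|S_j - E(S_j)| > j^{3/4}\log j;\ |S_{n^2} - E(S_{n^2})| \le n^{3/2}\log n\right) \\
&\quad \le P\left(|S_j - E(S_j)| > 2n^{3/2}\log n;\ |S_{n^2} - E(S_{n^2})| \le n^{3/2}\log n\right) \\
&\quad \le P(A_n^c) + P(B_n^c) + P(C_n^c) + P(D_n), \qquad (7)
\end{aligned}$$

where

$$D_n = \left\{|S_j - E(S_j)| > 2n^{3/2}\log n;\ |S_{n^2} - E(S_{n^2})| \le n^{3/2}\log n;\ A_n, B_n, C_n\right\}.$$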

Now if $\mathrm{diam}(T_N) = 2$, it follows that a vertex $j$ is Seymour iff
$O_j \le I_j$. Thus, if originally the number of Seymour vertices is within
$n^{3/2}\log n$ of $E(S_{n^2})$, and if $|S_j - E(S_j)| > 2n^{3/2}\log n$, then
what may have caused this? Note that $|E(S_j) - E(S_{n^2})| \le Kn$ for some
constant $K$ and for each $j = n^2+1, \ldots, (n+1)^2$. Also, we may assume (as
a worst case scenario) that the linear number of “new” vertices cause a change
to the Seymour status of $T_j$ by an amount equal to their magnitude. This
means that for large $n$, at least $(n^{3/2}\log n)/2$ of the original $n^2$
vertices must have “switched” their Seymour status. Now since
$|I_i - O_i| \le n\sqrt{40\log n}$ and $|I'_i - O'_i| \le \sqrt{80n\log n}$, no
$i$ with

$$\sqrt{80n\log n} < |I_i - O_i| \le n\sqrt{40\log n}$$

can switch. Now for some $L > 0$,

$$P(|I_i - O_i| = r) \le \frac{L}{n}$$

for each $r$ in $[0, \sqrt{80n\log n}]$, so that the expected number of $i$’s
that switch is no more than
$n^2 \cdot L\sqrt{80n\log n}/n \le Mn^{3/2}\sqrt{\log n}$ for some $M$.
Moreover, since the numbers of these $i$’s that switch are independent, we have
a high concentration of the number of vertices that switch around the expected
value. The probability that more than $(n^{3/2}\log n)/2$ switch is thus
exponentially small. It follows that $P(D_n)$ is exponentially small, and thus
from (7) that

$$P\left(|S_j - E(S_j)| > j^{3/4}\log j;\ |S_{n^2} - E(S_{n^2})| \le n^{3/2}\log n\right) \le \frac{K}{n^3}$$

for some constant $K$, so that

$$\sum_{j=n^2}^{(n+1)^2-1} P\left(|S_j - E(S_j)| > j^{3/4}\log j;\ |S_{n^2} - E(S_{n^2})| \le n^{3/2}\log n\right) \le \frac{3K}{n^2},$$

which yields

$$\sum_{n=1}^{\infty}\sum_{j=n^2}^{(n+1)^2-1} P\left(|S_j - E(S_j)| > j^{3/4}\log j;\ |S_{n^2} - E(S_{n^2})| \le n^{3/2}\log n\right) < \infty,$$

so that, with “i.o.” representing “infinitely often for $n = 1, 2, \ldots$ and
$j \in \{n^2, \ldots, (n+1)^2 - 1\}$,”

$$P\left(|S_j - E(S_j)| > j^{3/4}\log j;\ |S_{n^2} - E(S_{n^2})| \le n^{3/2}\log n \text{ i.o.}\right) = 0.$$
Since

$$P\left(|S_{n^2} - E(S_{n^2})| > n^{3/2}\log n \text{ infinitely often}\right) = 0,$$

Theorem 2.4 follows, with $\varepsilon = 1/4$. The case of general $\varepsilon$
follows by taking larger and larger subsequences; in fact, for arbitrary
$\varepsilon > 0$ we start with a polynomial subsequence $N = N(n)$ chosen so
that the Chebychev estimate on
$P(|S_N - E(S_N)| > N^{0.5+\varepsilon}\log N)$ is summable. The proof again
exploits the difference between $I$, $O$ and $I'$, $O'$ for the “new vertices”
between checkpoints, as above. □

Notice that we never really proved an exponential inequality above; rather,
we were able to show that the conclusion of the Borel-Cantelli lemma held
for deviations of the form $|S_n - E(S_n)| > n^{0.5+\varepsilon}$. However, we
believe:

Conjecture 2.5. Theorem 2.4 can be improved to assert that for some $K$,

$$\sum_{n=1}^{\infty} P\left(|S_n - E(S_n)| > K\sqrt{n\log n}\right) < \infty.$$

The proof might involve exponential rather than polynomial subsequences. 


3 Random Digraphs 


In this section, we consider random digraphs $D(n,p)$ defined as follows: for
each pair of vertices $\{u,v\} \subseteq V(D)$, we place an arc from $u$ to $v$
with probability $p \le 1/2$; similarly, we place an arc from $v$ to $u$ with
probability $p$. This construction gives no antiparallel edges, and the
probability that there is no edge between $u$ and $v$ is $1 - 2p$. We allow for
the case that $p = p_n \to 0$ slowly as $n \to \infty$ or that
$p = p_n = 0.5 - o(1)$. In order to study the behavior of the number of Seymour
vertices, we need the following concentration inequality from [1].
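
One way to realize this model in code is to draw a single uniform number per pair and split the unit interval into three pieces of lengths $p$, $p$, and $1 - 2p$. The following is a minimal sketch under that assumption; the function name `random_digraph` and its signature are illustrative and not taken from the paper.

```python
import random

def random_digraph(n, p, rng):
    """Out-neighbor sets of a sample from D(n, p) on {0, ..., n-1}, 0 <= p <= 1/2."""
    if not 0 <= p <= 0.5:
        raise ValueError("p must lie in [0, 1/2]")
    out = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            r = rng.random()
            if r < p:             # arc u -> v with probability p
                out[u].add(v)
            elif r < 2 * p:       # arc v -> u with probability p
                out[v].add(u)
            # otherwise (probability 1 - 2p) the pair stays non-adjacent
    return out

rng = random.Random(0)
d = random_digraph(50, 0.3, rng)
print(sum(len(s) for s in d))     # number of arcs in this sample
```

With $p = 1/2$ every pair receives exactly one arc, so the sketch reduces to the random tournament model of Section 2.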


Lemma 3.1. Let $X$ be a sum of independent indicator random variables.
Then for any $\varepsilon > 0$,

$$P(X \ge (1 + \varepsilon)E(X)) \le \left[\frac{e^{\varepsilon}}{(1 + \varepsilon)^{1+\varepsilon}}\right]^{E(X)}.$$
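
As a sanity check, the bound can be compared with an exact binomial tail. The sketch below assumes that Lemma 3.1 is the standard multiplicative Chernoff bound $[e^{\varepsilon}/(1+\varepsilon)^{1+\varepsilon}]^{E(X)}$; the function names and the values $n = 100$, $p = 0.3$ are chosen only for illustration.

```python
from math import comb, e, exp

def chernoff_bound(mean, eps):
    """Assumed right-hand side of Lemma 3.1: (e^eps / (1 + eps)^(1 + eps))^mean."""
    return (exp(eps) / (1 + eps) ** (1 + eps)) ** mean

def binomial_upper_tail(n, p, k):
    """Exact P(Bin(n, p) >= k)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))

n, p = 100, 0.3
eps = 1 / (2 * p) - 1             # chosen so that (1 + eps) * n * p equals n / 2
tail = binomial_upper_tail(n, p, n // 2)
bound = chernoff_bound(n * p, eps)
print(tail, bound)

# With this choice of eps the bound simplifies to (2pe / e^(2p))^(n/2).
print((2 * p * e / e ** (2 * p)) ** (n / 2))
```

The simplified form in the last line is the expression that appears when the lemma is applied to the event that a vertex has outdegree at least $n/2$.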


Theorem 3.2. Let $D(n,p)$ be a random digraph on $n$ vertices with arc
probability $\sqrt{(2+\varepsilon)\frac{\log n}{n}} \le p \le \frac{1}{2} - \delta_n$,
where $\varepsilon > 0$ is arbitrary and $\delta_n \to 0$ is specified below.
Let $S$ be the set of its Seymour vertices. Then $E(|S|) = n - o(1)$ as
$n \to \infty$.
Proof. Let $X = |S|$. We have

$$\begin{aligned}
E(X) &= nP(1 \in S) \\
&\ge nP(1 \in S;\ |N_1(1)| + |N_2(1)| = n-1) \\
&= nP\left(|N_1(1)| \le \frac{n-1}{2};\ |N_i(1)| = 0 \text{ for all } i \ge 3\right) \\
&\ge nP\left(|N_1(1)| \le \frac{n-1}{2}\right) - nP(|N_1(1)| + |N_2(1)| < n-1) \\
&\ge nP\left(|N_1(1)| \le \frac{n-1}{2}\right) - n(n-1)(1-p)(1-p^2)^{n-2} \\
&= nP\left(|N_1(1)| \le \frac{n-1}{2}\right) - o(1), \qquad (8)
\end{aligned}$$

provided that $p \ge \sqrt{(2+\varepsilon)\frac{\log n}{n}}$.
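But, by Lemma 3.1,

$$\begin{aligned}
nP\left(|N_1(1)| \le \frac{n-1}{2}\right) &= nP\left(\mathrm{Bin}(n-1,p) \le \frac{n-1}{2}\right) \\
&\ge n\left(1 - P\left(\mathrm{Bin}(n,p) \ge \frac{n}{2}\right)\right) \\
&\ge n - n\left(\frac{2pe}{e^{2p}}\right)^{n/2}. \qquad (9)
\end{aligned}$$

Now the function $\varphi(p) = n\left(\frac{2pe}{e^{2p}}\right)^{n/2}$ tends to
zero for each fixed $p \in (0, 1/2)$, but on letting $p \to 1/2$ and setting
$\frac{2pe}{e^{2p}} = 1 - \varepsilon_n$, we see that the right side of (9) is
of the form $n - n(1 - \varepsilon_n)^{n/2} = n - o(1)$ if
$\varepsilon_n = (2 + \eta)\log n/n$, where $\eta > 0$ is arbitrary. Thus by (8)
and (9) we have $E(X) \ge n - o(1)$ if
$\sqrt{(2+\varepsilon)\frac{\log n}{n}} \le p \le 0.5 - \delta_n$ for a
$\delta_n$ that may be computed explicitly. This proves the result. □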
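
The two error terms in the proof can be evaluated directly to see how they behave at the ends of the admissible range of $p$. The sketch below is illustrative only: the function names are hypothetical, and the formulas $n(n-1)(1-p)(1-p^2)^{n-2}$ and $n(2pe/e^{2p})^{n/2}$ are the reconstructed forms of the terms in (8) and (9).

```python
from math import e, log, sqrt

def distance_error(n, p):
    """n(n-1)(1-p)(1-p^2)^(n-2): union bound on some ordered pair being at distance >= 3."""
    return n * (n - 1) * (1 - p) * (1 - p * p) ** (n - 2)

def outdegree_error(n, p):
    """n (2pe / e^(2p))^(n/2): bound on the expected number of vertices with outdegree >= n/2."""
    return n * (2 * p * e / e ** (2 * p)) ** (n / 2)

def lower_threshold(n, eps):
    """Smallest admissible arc probability, sqrt((2 + eps) log n / n)."""
    return sqrt((2 + eps) * log(n) / n)

for n in (10 ** 4, 10 ** 6):
    p = lower_threshold(n, 1.0)
    print(n, p, distance_error(n, p))     # shrinks as n grows

print(outdegree_error(1000, 0.4))         # small for p bounded away from 1/2
print(outdegree_error(1000, 0.5))         # equals n at p = 1/2, so the bound is vacuous there
```

The last line illustrates why $p$ must stay at least $\delta_n$ below $1/2$: at $p = 1/2$ the base $2pe/e^{2p}$ equals 1 and the bound in (9) gives no information.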


Corollary 3.3. Let $D(n,p)$ be a random digraph with $p$ as in Theorem 3.2.
As $n$ goes to infinity, $D$ has exactly $n$ Seymour vertices with high
probability.

Proof. Suppose $|S| \le n - 1$ with some probability $q > 0$ that does not tend
to zero. Then $E(|S|) \le (n-1)q + n(1-q) = n - q$, which contradicts the fact
that $E(|S|) \ge n - o(1)$ as proven in Theorem 3.2. Notice that our approach
will also allow us to squeeze out results along the lines of an assertion that
states that for an infinite tournament $T$,
$P(|S_n| \le n - 1 \text{ infinitely often}) = 0$. □